arXiv:1501.03547v1 [cs.NI] 15 Jan 2015


Cloud-Assisted Remote Sensor Network Virtualization for Distributed 

Consensus Estimation 


Sherif Abdelwahab*, Bechir Hamdaoui*, and Mohsen Guizani†

* Oregon State University, abdelwas,hamdaoui@eecs.oregonstate.edu
† Qatar University, mguizani@ieee.org


Abstract 

We develop cloud-assisted remote sensing techniques for enabling
distributed consensus estimation of unknown parameters in a given
geographic area. We first propose a distributed sensor network
virtualization algorithm that searches for, selects, and coordinates
Internet-accessible sensors to perform a sensing task in a specific
region. The algorithm converges in linearithmic time for large-scale
networks, and requires exchanging a number of messages that is at most
linear in the number of sensors. Second, we design an uncoordinated,
distributed algorithm that relies on the selected sensors to estimate a
set of parameters without requiring synchronization among the sensors.
Our simulation results show that the proposed algorithm, when compared
to the conventional Alternating Direction Method of Multipliers (ADMM),
reduces communication overhead significantly without compromising the
estimation error. In addition, the convergence time, although slightly
longer, remains linear, as in the case of conventional ADMM.

1 Introduction 

As the Internet of Things (IoT) emerges, the number of sensor-equipped
things (e.g., smart-phones and tablets) is growing rapidly, making it
important to rethink the way conventional sensor-based management
techniques are designed. Although sensor-based distributed sensing has
been investigated in the past, cloud-assisted remote sensing is a new
paradigm that capitalizes on the capabilities of IoT to enable what is
called Sensing as a Service [1]. Distributed parameter estimation based
on cloud-assisted remote sensing is one example of such services.

Traditional sensor networks depend primarily on the sophistication and
accuracy of the sensory devices themselves to perform sensing tasks and
meet quality of service requirements. In the Sensing as a Service model,
cloud-based sensor networks rely mainly on swarms of participatory
sensors to perform remote sensing tasks. Unlike traditional sensory
devices, participatory sensors, although they bring new opportunities,
present key challenges, mainly pertaining to their sporadic availability
and unpredictable mobility.

In the cloud-based remote sensing paradigm, a cloud agent (or manager)
is responsible for receiving and handling remote sensing task requests
from cloud clients. When a request is granted, the agent is also
responsible for virtualizing a sensor network to perform the requested
sensing task. An example of such tasks is distributed consensus
estimation, in which an unknown set of parameters needs to be estimated.
Imagine, for example, a virtual sensing task that requires tracking the
location of an RFID-tagged person on a large campus. A group of
smart-phones connected through machine-to-machine physical links can
estimate the location of the person based on the RFID signal strength
each smart-phone receives and measures. In this case, a virtual sensing
request is sent to the cloud agent, which dispatches the request to a
few smart-phones in the swarm. These smart-phones autonomously search
for a group of smart-phones that can read RFID tags and are willing to
participate in the sensing task to estimate the distance based on
received signal strengths. Among all these smart-phones, a subgroup is
then selected to form a virtual sensor network connected according to a
topology that is specified by the cloud agent itself. This virtual
sensor network cooperatively estimates the location of the person in a
distributed manner using distributed linear consensus algorithms (see
for example [iQiin]), and continuously sends location updates to the
cloud agent, which in turn forwards them to the cloud client.

In this paper, we develop cloud-assisted remote sensing 
algorithms that enable distributed consensus estimation of 
unknown parameters. Specifically, we propose: 

• An efficient network virtualization algorithm that can search for and
select sensors from the swarm to form a virtual sensor network that can
perform a sensing task. The algorithm consists of selecting a set of
sensors that are willing to participate in the requested remote sensing
task, and finding optimal one-to-one mappings between virtual and
participatory sensors.

• An efficient estimation algorithm that relies on the virtual sensor
network, formed by our proposed virtualization algorithm, to estimate a
set of unknown parameters in a distributed way and without requiring
synchronization among participatory sensors.

Our results show that, given a virtual sensing task requiring $g$
sensors and a swarm of $n$ sensors in which a pair of sensors is
considered connected if the sensors are within a distance $r$ of one
another, our virtualization algorithm finds the capabilities of the
sensors in $O(r^{-1} \log n)$ time with an average of $O(1)$ messages
per sensor. Our results also show that our proposed algorithms achieve a
virtualization benefit that is very close to an upper bound in
$O(\max\{r^{-1} n \log n, g^3\})$ time with a worst-case average of
$O(n)$ messages per sensor. Finally, our results show that our proposed
distributed parameter estimation algorithm has a linear convergence time
and incurs communication overhead that is at least an order of magnitude
lower than that incurred by the conventional Alternating Direction
Method of Multipliers (ADMM).

In our design, the cloud agent does not need to have full knowledge of
the underlying swarm of sensors, including its topology, sensor
capabilities, and sensor availability. We only assume that the cloud
agent can communicate directly with some, but not necessarily all,
sensors. Each sensor has to exchange small messages with only its direct
neighbors to diffuse information across the whole or part of the swarm.
We also assume that the swarm is dynamic by nature in that sensors'
availability, locations, connectivity, and capacities may change
sporadically and unpredictably over time.

This paper is organized as follows. In Section 2, we overview our
algorithmic design, describe the requirements, and review related work.
In Section 3, we present our sensor network virtualization algorithm. In
Section 4, we present our distributed parameter estimation algorithm.
Finally, we evaluate and compare the performance of our proposed
algorithms numerically in Section 5 and conclude the paper in Section 6.

2 Framework Overview: System 
Model, Problem Formulation, 
and Related Work 

In this section, we provide a brief overview of the framework components
and algorithms that are proposed in this paper to perform parameter
estimation based on cloud-assisted remote sensing. We also review prior
work related to the proposed algorithms.

2.1 Participatory Sensors 

We consider a swarm of a large number of participatory sensors that are
managed by a cloud platform. Each sensor of the swarm is cloud
accessible via the Internet and is assumed to have some sensing
capability. We model the swarm as a Euclidean geometric random graph
$G = (S, L)$, where $S$ is a set of $n$ sensors and $L$ is the set of
all links connecting the sensors; two sensors are considered to be
connected if they are within a transmission radius, $r$, of each other.
Let $loc(i)$ denote the physical location of sensor $i$, and $C(i)$
denote its sensing capability or capacity ($C(i)$ can, for example,
refer to the maximum allowed sensing time, processing power, or memory
capacity).

We assume that each sensor $i \in S$ is capable of estimating a
real-valued vector of unknown parameters, $\theta$, through a vector of
noisy measurements, $x_i$. That is,

$$x_i = H_i \theta + u_i, \quad i = 1, \dots, n,$$

where the matrix $H_i$ is sensor $i$'s sensing model (typically known to
$i$ only) relating $x_i$ to $\theta$, and $u_i$ is additive Gaussian
noise with zero mean and variance $\sigma_i^2$. We assume that $u_i$ and
$u_j$ are independent of one another for all $i, j \in S$ with
$i \neq j$. Because different sensors may have different sensing models
and/or different measurement methods, it is very likely that different
sensors have different estimates of $\theta$. Also, we do not require
that the sensors be synchronized; that is, the consensus algorithms we
develop in this paper to estimate $\theta$ are asynchronous.

2.2 Virtual Remote Sensing 

We assume that there exists a cloud manager/agent that is responsible
for managing the participatory sensors, handling virtual sensing task
requests (submitted by cloud clients), and ensuring that clients'
Service Level Agreements (SLAs) are met once their requests are granted
by the cloud. Each virtual sensing task request is represented by a
quadruple, $(g, \theta, c, \delta)$, where $g$ is the number of
(virtual) sensors requested to perform the sensing task, $\theta$ is a
real-valued column vector of unknown parameters to be estimated by the
virtual sensors, and the location $c$ and radius $\delta$ indicate the
area of interest that needs to be sensed; i.e., all requested sensors
must be located within distance $\delta$ from the center $c$. Generally
speaking, SLAs consist of: (i) a maximum time within which the sensing
task must be completed, (ii) an absolute tolerance $\epsilon_{abs} > 0$
of the estimation quality, (iii) a relative tolerance
$\epsilon_{rel} > 0$ of the estimation quality (maximum gap between the
$g$ sensors' local estimates of $\theta$), and (iv) a maximum rejection
rate, defined as the ratio of the number of failed virtualizations to
the total number of sensing task requests.

Upon receiving a sensing task request, the cloud agent's job is then to
define a set $V$ of $g$ virtual sensors to be realized by $g$ connected
participatory sensors, all located within distance $\delta$ from the
center $c$, that can collaboratively estimate $\theta$ in a distributed
manner. Depending on the SLAs of the sensing task, the cloud agent needs
to determine each of the following three parameters before proceeding
with the sensor network virtualization needed to perform a requested
sensing task. First, it needs to choose a suitable virtual topology that
connects the set of virtual sensors, $V$, so that they can perform the
sensing task collaboratively. Although other topologies can be used, we
focus in this paper on three types: complete, cyclic, and star. For a
given topology, let $E$ denote the set of virtual links connecting the
virtual sensors and $T = (V, E)$ be the graph representing the virtual
sensor network. Note that the cloud agent needs to make sure that the
virtual sensor network is connected according to the chosen topology.
Another parameter the cloud agent needs to fix and associate with $T$ is
the maximum allowed path length, $h$, between any pair of virtual
sensors. The parameter $h$ can be viewed as a way to limit the number of
sensors/hops that message exchanges among virtual sensors can go
through. It can be used to impose an upper bound on end-to-end message
delays and, thus, on the sensing task completion time. Note that a
virtual link between two virtual sensors may be realized by more than
two physical sensors, and some of these sensors may not necessarily
realize a virtual sensor in $V$ themselves. That is, some sensors may
only be used for forwarding traffic without participating in performing
the sensing task.

The third parameter the cloud agent needs to choose and set is the
sensing capacity threshold $R(j)$ for a virtual sensor $j \in V$. The
capacity $C(i)$ of a participatory sensor $i$ that realizes $j$ must
then exceed the capacity threshold $R(j)$. This threshold can, for
example, represent the minimum storage capacity, the minimum CPU
computing power, and/or the minimum amount of time required by the
sensing task.

The required values of the parameters associated with $T$, including
$E$, $R(j)$, and $h$, will be mainly determined by the client's SLA, and
their selection is beyond the scope of this paper. In this paper, we
focus instead on the design of efficient algorithms that meet these
design requirements. Our ultimate objective in this work is to design a
distributed consensus algorithm that enables the estimation of the
unknown parameter vector, $\theta$, subject to the design parameter
requirements specified by both the cloud client and the cloud agent. To
this end, the key tasks that need to be executed by the swarm of
participatory sensors to perform estimation are:

2.2.1 Sensor Search 

This task consists of searching, among all participatory sensors, for
the sensors that meet the requirements of $T$. More specifically, the
swarm searches for a subset of participatory sensors, $S' \subseteq S$,
such that a sensor $i \in S'$ if:

i) it can sense and estimate $\theta$ through a sensing model $H_i$;
i.e., it can observe a vector $x_i$ that can be expressed as
$x_i = H_i \theta + u_i$,

ii) it is geographically located within distance $\delta$ from $c$, and

iii) its capacity $C(i) > R(j)$ for at least one virtual sensor
$j \in V$.

For each participatory sensor $i$, we define the virtual domain of $i$,
$\mathcal{D}(i)$, as the set of all virtual sensors that can be
supported by sensor $i$; that is,

$$\mathcal{D}(i) = \begin{cases} \{ j \in V : C(i) > R(j) \} & \text{if } \| loc(i) - c \| \le \delta, \\ \emptyset & \text{otherwise,} \end{cases} \qquad (1)$$

and the objective of this task is to construct, with the minimum
possible communication overhead, each participatory sensor $i$'s virtual
domain, $\mathcal{D}(i)$, and to determine the set $S'$ as fast as
possible, all without assuming prior knowledge of the topology of $G$.
Our proposed technique for performing such a task is presented in
Section 3.
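As an illustration of Eq. (1), the following minimal Python sketch evaluates the virtual domain of a single sensor. The function name, the dictionary layout of the thresholds, and the numeric values are assumptions made for this example only; they are not part of the proposed algorithm.

```python
import math

def virtual_domain(loc_i, cap_i, center, radius, thresholds):
    """Return the virtual domain D(i) of one participatory sensor, per Eq. (1).

    thresholds maps each virtual sensor j to its capacity threshold R(j).
    """
    # A sensor outside the area of interest supports no virtual sensor.
    if math.dist(loc_i, center) > radius:
        return set()
    # Inside the area: keep every virtual sensor whose threshold is exceeded.
    return {j for j, r_j in thresholds.items() if cap_i > r_j}

# Hypothetical thresholds R(j) for three virtual sensors.
thresholds = {"v1": 2.0, "v2": 5.0, "v3": 8.0}
inside = virtual_domain((1.0, 1.0), 6.0, (0.0, 0.0), 2.0, thresholds)
outside = virtual_domain((3.0, 0.0), 9.0, (0.0, 0.0), 2.0, thresholds)
print(sorted(inside), outside)
```

The first sensor lies inside the area and exceeds two of the three thresholds; the second has ample capacity but lies outside the area, so its domain is empty.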

Related work. In a recent work, Perera et al. [muz] described a
context-aware sensor search system that addresses the research
challenges of searching for sensors when large numbers of sensors with
overlapping and redundant functionality are available to the cloud. Such
a sensor search approach suffers from practical limitations, as it
relies on centralized knowledge of all available sensors and requires
continuous tracking of the sensors' dynamics, such as the sensors'
availability, connectivity, and mobility.

Sensor search algorithms need to be simple, have bounded search latency,
and incur minimum communication overhead between the cloud platform and
the large number of participatory sensors. Unlike centralized resource
discovery algorithms [3], gossip-based search protocols are distributed
and topology-independent, which makes them more suitable for sensor
search in the IoT context. Gossip protocols were originally designed for
information dissemination [4], and have been demonstrated to be
effective in resource discovery in Peer-to-Peer (P2P) networks [9]. Our
proposed framework relies on gossip techniques to determine the set of
sensors, among all participatory sensors, that are capable of performing
sensing and are willing to participate in the formation of a virtual
sensor network.


2.2.2 Sensor Network Virtualization 

This virtualization task consists of finding (i) a set $A \subseteq S'$
of exactly $g$ connected sensors selected among all sensors in $S'$ (the
$g$ selected sensors should be connected according to the virtual
topology chosen by the cloud agent) and (ii) a set
$M_A \subseteq \{(i, j) \in A \times V : j \in \mathcal{D}(i)\}$ of
one-to-one mapped pairs (each participatory sensor in $A$ is mapped to
one and only one virtual sensor in $V$) such that the length,
$h(i, i')$, of any simple path connecting two distinct participatory
sensors $i, i'$ in $A$ mapping a pair of directly connected virtual
sensors $(j, j') \in E$ is less than or equal to $h$. We refer to a
possible $\{A, M_A\}$ pair as a feasible virtualization of the requested
virtual sensor network $T$. Note that for any possible set $A$, there
could exist multiple possible sets $M_A$, each of which can form a
feasible virtualization when paired with $A$; the objective of a sensor
network virtualization algorithm is then to find the 'optimal' feasible
virtualization, $\{A, M_A\}^*$.

We now define what an 'optimal' feasible virtualization means. We
consider that the cost (to the cloud) of virtualizing a virtual sensor
network request, $T = (V, E)$, is determined by the amounts of requested
resources and given by $\mathrm{Cost}(T) = \alpha |V| + \beta |E|$,
where $\alpha$ denotes an incentive paid by the cloud to each
participatory sensor, and $\beta$ denotes an incentive associated with
each physical path between each pair of participatory sensors in $A$. An
incentive could be monetary or could be in any other form (e.g.,
credits, services, etc.). On the other hand, the total benefit (to the
cloud/swarm of sensors) resulting from a feasible virtualization,
$\{A, M_A\}$, can be expressed as


$$\text{Benefit} = \alpha \sum_{(i, j) \in M_A} \frac{C(i) - R(j)}{C(i)} + \beta \sum_{(i, i') \in P} \frac{h - h(i, i')}{h}, \qquad (2)$$

where $h(i, i')$ is again the path length (in number of hops) of the
path connecting the sensors of the pair $(i, i')$ mapping the virtual
link between $j$ and $j'$, and
$P = \{(i, i') \in A \times A : (i, j), (i', j') \in M_A, (j, j') \in E\}$
denotes the set of all such pairs. Note that for a given sensing task
request (i.e., for a given virtualization cost), the fewer the physical
resources used, the higher the total benefit. The sensor network
virtualization algorithm that we propose then consists of finding a
feasible virtualization that maximizes the total benefit given in
Eq. (2). We refer to the optimal solution as $\{A, M_A\}^*$. Clearly,
finding $\{A, M_A\}^*$ is a hard problem due to the factorial size of
the solution space (in $n$). Our first contribution in this work is to
develop an algorithm that solves this virtualization problem
efficiently. The algorithm is presented in Section 3.
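The feasibility conditions of this subsection can be checked mechanically. The sketch below is only an illustration under stated assumptions: the swarm is given as a hypothetical adjacency dictionary, the path length between two sensors is taken to be the shortest hop count found by breadth-first search, and the names `hop_counts` and `is_feasible` are introduced here for the example.

```python
from collections import deque

def hop_counts(adj, src):
    """Breadth-first search: hop distance from src to every reachable sensor."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_feasible(mapping, domains, virtual_links, adj, h):
    """mapping: participatory sensor -> virtual sensor it realizes."""
    # One-to-one: no virtual sensor may be realized twice.
    if len(set(mapping.values())) != len(mapping):
        return False
    # Each sensor must support the virtual sensor it realizes.
    if any(j not in domains[i] for i, j in mapping.items()):
        return False
    realizer = {j: i for i, j in mapping.items()}
    for j, j2 in virtual_links:
        # Both endpoints of every virtual link must be realized.
        if j not in realizer or j2 not in realizer:
            return False
        # The realizing sensors must be within h hops of each other.
        d = hop_counts(adj, realizer[j]).get(realizer[j2])
        if d is None or d > h:
            return False
    return True

# Hypothetical swarm: a line a-b-c-d, and one virtual link (v1, v2).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
domains = {"a": {"v1"}, "c": {"v1", "v2"}}
mapping = {"a": "v1", "c": "v2"}
ok_h2 = is_feasible(mapping, domains, [("v1", "v2")], adj, h=2)
ok_h1 = is_feasible(mapping, domains, [("v1", "v2")], adj, h=1)
print(ok_h2, ok_h1)
```

Sensors a and c are two hops apart, so the mapping is feasible for h = 2 and infeasible for h = 1.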

Related work. Network virtualization techniques proposed in the past
decade consist mainly of virtual network embedding algorithms, which
instantiate virtual networks on substrate infrastructures [IQIIS]. Most
of these virtual network embedding algorithms are centralized (e.g. H)
due to the ease of deployment of centralized approaches in cloud
platforms, where the cloud provider desires to have full control over
the physical network resources.

Distributed virtual network embedding and virtualization algorithms have
also been proposed in the literature mm- One limitation of the algorithm
proposed in [12] lies in its unsuitability for swarm virtualization, as
the authors assume unlimited physical resources and consider an offline
resource virtualization approach. As for the more recent work proposed
in ( 2 , although the virtualization phase of the physical network does
not require full/global knowledge of the swarm, the cloud must initially
partition the swarm into hierarchies and delegate each virtual network
request to a different hierarchy, which still requires full knowledge of
the sensors in each hierarchy. Unlike this approach, our proposed
virtualization algorithm is fully distributed. In addition, it does not
require any synchronization among sensors.


2.2.3 Distributed Consensus Estimation 

This task relies on the virtual sensor network (formed in the previous
step) to provide an estimate of $\theta$ that is at most
$\epsilon_{abs}$ from the optimal, such that all the virtualized sensors
agree on the same estimate of $\theta$ within a tolerance of
$\epsilon_{rel}$.

Without loss of generality, consider indexing the selected $g$ sensors
in the virtual sensor network as $1, \dots, g$ and let
$x = [x_1^T, \dots, x_g^T]^T$, $H = [H_1^T, \dots, H_g^T]^T$, and
$u = [u_1^T, \dots, u_g^T]^T$. The aggregate measurements can then be
written as $x = H\theta + u$. One simple approach to estimating $\theta$
is to have the cloud agent first collect from each virtual sensor $i$
its measurement vector, $x_i$, and its sensing model, $H_i$, and then
solve the following Least Squares (LS) problem

$$\text{minimize} \quad \tfrac{1}{2} \| x - H\theta \|_2^2, \qquad (3)$$

where $\theta$ is here the optimization variable. The unbiased
maximum-likelihood (ML) estimate of $\theta$ is simply
$\hat{\theta}_{LS} = H^{\dagger} x$, where $H^{\dagger}$ denotes the
pseudo-inverse of $H$.

This centralized LS approach, though simple, requires that each
virtualized sensor exchange its measurement vector and its sensing model
with the cloud agent, which can create significant communication
overhead, especially when the number of measurements, $M$, and the
number of virtual sensors, $g$, are large.
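To make the centralized baseline concrete, the following sketch stacks the per-sensor models and measurements and solves the LS problem through the normal equations $H^T H \theta = H^T x$, which give the same solution as the pseudo-inverse when $H$ has full column rank. It is a plain-Python illustration only: the helper names and the small noiseless data set are assumptions made for the example, and a numerical library would normally be used instead.

```python
def solve_linear(a, b):
    """Solve a @ t = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def centralized_ls(models, measurements):
    """Stack every (H_i, x_i) and solve H^T H theta = H^T x."""
    rows = [row for h_i in models for row in h_i]   # stacked H
    x = [v for x_i in measurements for v in x_i]    # stacked x
    d = len(rows[0])
    hth = [[sum(r[p] * r[q] for r in rows) for q in range(d)] for p in range(d)]
    htx = [sum(r[p] * v for r, v in zip(rows, x)) for p in range(d)]
    return solve_linear(hth, htx)

# Hypothetical example: two sensors, true theta = (2, -1), no noise.
models = [[[1.0, 0.0], [1.0, 1.0]], [[0.0, 1.0], [2.0, 1.0]]]
measurements = [[2.0, 1.0], [-1.0, 3.0]]
theta_hat = centralized_ls(models, measurements)
print(theta_hat)
```

Every sensor must ship both its model and its measurements to the solver, which is the communication cost the decentralized approach is meant to avoid.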

We instead propose, in this paper, a decentralized approach that relies
on the virtual sensor network to provide an estimate of the parameter
vector $\theta$. We rely on the recent results presented in [ 2 ^ to
develop our distributed estimation algorithm, which reduces
communication overhead significantly when compared to the conventional
ADMM approach [16], in addition to not requiring synchronization among
sensors. The proposed algorithm is presented in Section 4.

Related work. Distributed parameter estimation approaches have been
proposed in miEoKn]. Estimation can, for example, be carried out by
first computing a local estimate at each virtual sensor and then
performing a distributed weighted average of the local estimates EQ).
This approach results in an ML estimate, but does not bound the
variation between the mean square errors of local estimates. More
recently, Paul et al. [16] proposed a distributed estimation algorithm
based on ADMM. Although this approach results in an optimal mean square
error when compared to LS, it exhibits a significant in-network
communication overhead, requiring even more messages to be exchanged
among sensors than in the centralized LS. One approach, also proposed in
m, to overcome this problem is to approximate the computation of primal
and dual variables at each step of the algorithm by using earlier
versions of these variables instead of sharing them at each iteration,
which marginally reduces the communication overhead. In addition to the
increased communication overhead, conventional ADMM requires synchronous
operation of the sensors. This is very challenging from a practical
viewpoint, and does not scale well, especially when applied in the IoT
context. It has been shown recently that an asynchronous implementation
of ADMM has $O(1/k)$ convergence [Hj.

Our proposed estimation algorithm is not only asynchronous and
distributed, but it also reduces communication overhead significantly
when compared to the conventional ADMM approach [16].

3 Sensor Network Virtualization 

We begin by presenting our proposed Randomized and Asynchronous
Distributed Virtualization (RADV) algorithm, which consists of four
phases: (I) searching for sensors that can support a virtual sensing
task request $T$, (II) pruning of virtual domains $\mathcal{D}(i)$ for
all $i \in S$, (III) construction of benefit matrices in a distributed
manner, and (IV) solving assignment problems at virtual sensors. This
approach results in multiple solutions, each evaluated by a different
sensor, and the cloud agent selects the solution with the maximum
benefit.

We design time-invariant gossip-based algorithms for the first three
phases, in which sensors exchange information randomly mi]. At the
$k$-th time slot, let sensor $i$ be active and contact its neighbor
sensor $j$ (i.e., $(i, j) \in L$) with probability $T_{ij} > 0$ only if
$j$ can be contacted by more than one of its neighbors. The probability
$T_{ii}$ denotes the probability that $i$ does not contact any other
sensor. Let the $n \times n$ matrix $T = [T_{ij}]$ be a doubly
stochastic transition matrix of non-negative entries [5]. A natural
choice of $T_{ij}$ is

$$T_{ij} = \begin{cases} \dfrac{1}{d_i + 1} & \text{if } i = j \text{ or } (i, j) \in L, \\ 0 & \text{otherwise,} \end{cases} \qquad (4)$$

where $d_i = |\{j \in S : (i, j) \in L\}|$ is the degree of sensor $i$.
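The choice in Eq. (4) can be built directly from the link set. The sketch below is an illustration only; the function name and the four-sensor cycle used as input are assumptions made for the example.

```python
def transition_matrix(n, links):
    """Build T per Eq. (4): T[i][j] = 1/(d_i + 1) if i == j or (i, j) is a link."""
    neighbors = {i: set() for i in range(n)}
    for i, j in links:
        neighbors[i].add(j)
        neighbors[j].add(i)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        p = 1.0 / (len(neighbors[i]) + 1)   # d_i neighbors plus the self-loop
        t[i][i] = p
        for j in neighbors[i]:
            t[i][j] = p
    return t

# Hypothetical swarm: four sensors connected in a cycle.
t = transition_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
row_sums = [sum(row) for row in t]
col_sums = [sum(t[i][j] for i in range(4)) for j in range(4)]
print(row_sums, col_sums)
```

Each row always sums to one. The columns also sum to one in this example because every sensor has the same degree; when degrees differ, the columns of this particular choice need not sum to one.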

We now present each of the four phases in greater detail.

Phase I—Sensor Search 

The objective of this phase is to construct $\mathcal{D}(s)$ for all
$s \in S$, and $S'$, as fast as possible without assuming prior
knowledge of the topology of $G$. First, the cloud initiates the search
at time $k = 0$ by sending $T$ to one or more arbitrary sensors. At any
later time $k$, $s$ and its neighbor $s'$ exchange information as
follows: $s$ pushes $T$ to $s'$ only if $s'$ does not have $T$, or pulls
it from $s'$ only if $s$ does not have $T$. If $s$ contacts $s'$ and
both $s$ and $s'$ have received $T$ before, $s$ stops contacting any
other sensor. Upon receipt of $T$, a sensor $s$ constructs
$\mathcal{D}(s)$ according to Eq. (1), and $S'$ is given by

$$S' = \{ s \in S : |\mathcal{D}(s)| > 0 \}.$$
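The push-pull exchange of this phase can be simulated in a few lines. The sketch below is a simplified illustration, not the RADV implementation: it assumes that every sensor is activated once per round, that the neighbor is drawn uniformly at random rather than according to Eq. (4), and that the swarm is a hypothetical ring of eight sensors.

```python
import random

def spread_request(adj, seeds, rng, max_rounds=10_000):
    """Simulate push-pull dissemination of the request; return (rounds, informed)."""
    informed = set(seeds)
    stopped = set()
    rounds = 0
    while len(informed) < len(adj) and rounds < max_rounds:
        rounds += 1
        for s in adj:
            if s in stopped:
                continue
            peer = rng.choice(adj[s])
            if s in informed and peer in informed:
                stopped.add(s)       # both already hold the request: s goes quiet
            elif s in informed:
                informed.add(peer)   # push the request to the peer
            elif peer in informed:
                informed.add(s)      # pull the request from the peer
    return rounds, informed

# Hypothetical swarm: a ring of eight sensors, request injected at sensor 0.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
rounds, informed = spread_request(ring, seeds=[0], rng=random.Random(0))
print(rounds, sorted(informed))
```

Sensors that have gone quiet can still be contacted, so uninformed sensors keep pulling until the request has reached the whole connected swarm.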


A virtual domain of sensor $i$, $\mathcal{D}(i)$, evaluated during the
sensor search phase is not sufficient to tell whether a feasible
embedding can be found if sensor $i$ virtualizes (is mapped to) $j$. For
example, the virtualization in which sensor $i$ virtualizes $j$ and
sensor $i'$ virtualizes $j'$, where there is a virtual link $(j, j')$,
is not feasible if $i'$ is not reachable from $i$ and vice versa. When
this situation occurs, we say that sensor $i$ is incapable of supporting
the topology requirement specified by $E$, which triggers a virtual
domain pruning.

Phase II—Virtual Domain Pruning

During this phase, we ensure that all virtualized sensors maintain the
topology $E$ by allowing a sensor to receive the virtual domains of
other sensors and delete a virtual sensor $j$ from its domain if there
exists a virtual link $(j, j')$ such that $j'$ is not included in any
other received domain. Let $D_s \subseteq \{\mathcal{D}(i) : i \in S\}$
denote the set of domains that sensor $s$ has at time $k$. Initially,
$D_s = \{\mathcal{D}(s)\}$ and $h(i, s) = 0$ for all $i \in S$.¹ Using
the same transition matrix, $T$, defined in Eq. (4), $s$ contacts only
one of its neighbors $s'$ at time $k$. Then, for all
$\mathcal{D}(i) \in D_s : i \neq s'$, $s$ pushes $\mathcal{D}(i)$ to
$s'$ only if $s'$ did not receive $\mathcal{D}(i)$ before and
$h(i, s) < h$. Also, for all $\mathcal{D}(i) \in D_{s'} : i \neq s$, $s$
pulls $\mathcal{D}(i)$ from $s'$ only if $s$ did not receive
$\mathcal{D}(i)$ before and $h(i, s') < h$. If no information is
exchanged between $s$ and $s'$ at time $k$, $s$ stops contacting any of
its neighbors. However, $s$ may restart contacting its neighbors if it
updates $D_s$ after time $k + 1$.

When $s$ constructs its $D_s$, it starts by pruning $\mathcal{D}(s)$.
The pruning is performed by deleting a virtual sensor
$j \in \mathcal{D}(s)$ (i.e.,
$\mathcal{D}(s) \leftarrow \mathcal{D}(s) \setminus \{j\}$) if at least
one of the virtual sensors that are connected to $j$,
$\{j' \in V : (j, j') \in E\}$, is not included in any received
$\mathcal{D}(i)$, i.e., $j' \notin \mathcal{D}(i)$ for all
$\mathcal{D}(i) \in D_s$. This pruning rule ensures that the virtualized
sensors maintain the required topology $E$ and that the constructed
benefit matrices result in a feasible virtualization.

¹ Knowledge about the existence of other sensors is not needed, and $h$
is typically evaluated dynamically.

Phase III—Construction of Benefit Matrices

As mentioned earlier, finding a feasible virtualization,
$\{A, M_A\}^*$, that maximizes the total benefit given in Eq. (2) is a
hard problem due to the large size of the solution space. Therefore,
this phase proposes an efficient way of solving this virtualization
problem. Specifically, we propose a method that solves this problem in a
distributed manner and without requiring any synchronization among
sensors, as described next.

During this phase, each sensor $s$ locally constructs its own set,
$A^{(s)}$, of $g$ sensors that $s$ chooses as virtualized sensors to
assign to virtual sensors in $V$. Each sensor $s$ also maintains $g$ row
vectors, $B_i^{(s)}$ for $i \in A^{(s)}$, each with one entry per
virtual sensor in $V$, that we define as the benefit vector of sensor
$i$ seen by $s$, where the $j$-th element, $B_{ij}^{(s)}$, denotes the
benefit of assigning participatory sensor $i \in A^{(s)}$ to the virtual
sensor $j \in V$ as seen by $s$, and is given by

$$B_{ij}^{(s)} = \begin{cases} \alpha \dfrac{C(i) - R(j)}{C(i)} + \beta \dfrac{h - h(i, s)}{h} & \text{if } j \in \mathcal{D}(i), \\ 0 & \text{otherwise.} \end{cases}$$


Our objective is then to construct, for each $s \in S$, the benefit
matrix $B^{(s)} = [B_{ij}^{(s)}]$ as fast as possible, and find a
feasible virtualization, $\{A, M_A\}$, that maximizes the total benefit,

$$\sum_{(i, j) \in M_A} B_{ij}^{(s)},$$

among all $s \in S'$ without knowing the structure of $G$. Moreover, the
path length between a sensor $s$ and any other sensor $i$ that $s$
includes in its benefit matrix must not exceed $h$. Finally, a sensor
$s$ shall include only the benefit vectors of the $g$ sensors with the
largest possible benefit.

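Each sensor $s$ initially sets $A^{(s)} = A^{(s)} \cup \{s\}$ if
$\mathcal{D}(s) \neq \emptyset$, sets $h(i, s) = 0$ for all $i \in S$,
and sets

$$B_{sj}^{(s)} = \begin{cases} \alpha \dfrac{C(s) - R(j)}{C(s)} + \beta & \text{if } j \in \mathcal{D}(s), \\ 0 & \text{otherwise.} \end{cases}$$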
Also, $s$ maintains a scalar, $b_s^{\min}$, defined as the minimum total
benefit it has received from any other sensor and written as

$$b_s^{\min} = \min_{i \in A^{(s)}} \sum_{j \in V} B_{ij}^{(s)},$$

and the corresponding sensor,

$$i_s^{\min} = \arg\min_{i \in A^{(s)}} \sum_{j \in V} B_{ij}^{(s)}.$$

Initially, $b_s^{\min} = 0$ and remains so until $|A^{(s)}| = g$.

Using the same transition matrix, $T$, defined in Eq. (4), $s$ contacts
its neighbor $s'$ only once at each time $k$. Then, for all
$i \in A^{(s)} : i \neq s'$, $s$ pushes the benefit vector $B_i^{(s)}$
to $s'$ only if $h(i, s) < h$ and

$$\sum_{j \in V} B_{ij}^{(s)} > b_{s'}^{\min}.$$

Also, for all $i \in A^{(s')} : i \neq s$, $s$ pulls the benefit vector
$B_i^{(s')}$ from $s'$ only if $h(i, s') < h$ and

$$\sum_{j \in V} B_{ij}^{(s')} > b_s^{\min}.$$

If no information is exchanged between $s$ and $s'$ at time $k$, $s$
stops contacting its neighbors at time $k + 1$. However, $s$ may restart
contacting its neighbors if $B^{(s)}$ is updated after time $k + 1$.

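When $s$ receives $B_i^{(s')}$, $s$ updates $B_{ij}^{(s)}$ as

$$B_{ij}^{(s)} = \begin{cases} B_{ij}^{(s')} - \beta / h & \text{if } j \in \mathcal{D}(i), \\ 0 & \text{otherwise.} \end{cases}$$

If $i \notin A^{(s)}$, then we have two scenarios. In the first
scenario, $s$ still has not received $g$ benefit vectors, so
$b_s^{\min} = 0$ and $|A^{(s)}| < g$; $s$ then updates its set of
candidate sensors as $A^{(s)} = A^{(s)} \cup \{i\}$. In the other
scenario, in which $|A^{(s)}| = g$, $s$ replaces the sensor
corresponding to the minimum total benefit, $i_s^{\min}$, with $i$ so
that $A^{(s)} = (A^{(s)} \setminus \{i_s^{\min}\}) \cup \{i\}$. On the
other hand, if $i \in A^{(s)}$, then $s$ updates $B_{ij}^{(s)}$ if
$\sum_{j \in V} B_{ij}^{(s')} > \sum_{j \in V} B_{ij}^{(s)}$. Finally,
$s$ updates $b_s^{\min}$, $i_s^{\min}$, and $h(i, s)$ as
$h(i, s) = h(i, s') + 1$.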
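The bookkeeping of the candidate set can be sketched as follows. This is a simplified illustration under stated assumptions: each candidate is summarized by its total benefit (the sum of its benefit vector) rather than by the full vector, hop counts are ignored, and the function name and numeric values are hypothetical.

```python
def offer_candidate(candidates, i, total_benefit, g):
    """Keep the g sensors with the largest total benefit seen so far.

    candidates maps a sensor id to its total benefit. Returns the current
    minimum b_min, which stays at 0 until g candidates are known.
    """
    if i in candidates:
        # Known sensor: keep the larger of the two benefit values.
        candidates[i] = max(candidates[i], total_benefit)
    elif len(candidates) < g:
        # Fewer than g candidates so far: simply add the sensor.
        candidates[i] = total_benefit
    else:
        # Set is full: replace the worst candidate if the new one is better.
        worst = min(candidates, key=candidates.get)
        if total_benefit > candidates[worst]:
            del candidates[worst]
            candidates[i] = total_benefit
    return min(candidates.values()) if len(candidates) == g else 0.0

candidates = {}
history = []
for sensor, total in [("a", 1.2), ("b", 0.8), ("c", 1.0), ("d", 0.5)]:
    history.append(offer_candidate(candidates, sensor, total, g=2))
print(candidates, history)
```

With g = 2, sensor c displaces sensor b, while sensor d is rejected because its total benefit does not exceed the current minimum.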

Finding a feasible virtualization that maximizes the benefit
$B^{(s)} = [B_{ij}^{(s)}]$ instead of the benefit given in Eq. (2) makes
the problem easier because every sensor has a different value for the
benefit $B_{ij}^{(s)}$ that depends only on the length of the physical
path between $i$ and $s$ instead of the path lengths of all possible
combinations of sensor pairs $(i, i')$ that can virtualize a virtual
link. Intuitively, this relaxation still leads to an optimal or
near-optimal virtualization because, for a connected swarm $G$, the
number of sensors that are directly connected by a single physical link
(clique) grows logarithmically in $n$, and hence this number is larger
than $g$ almost surely as $g \ll n$. In such a case, it is sufficient to
ensure that the paths between $i$ and $s$ and between $i'$ and $s$ are
the shortest possible ones to ensure that the path between $i$ and $i'$
is also the shortest, as in this case $s$, $i$, and $i'$ reside in the
same clique with high probability. We evaluate the effectiveness of this
relaxation in Section 5 and show that our virtualization algorithm
performs well even when the condition $g \ll n$ does not hold.

Phase IV—Solving Local Assignment Problem 

After reception of the $g$ benefit vectors, $s$ proceeds to this phase
of the algorithm only if it stops communicating and $|A^{(s)}| = g$.
Each sensor $s \in S$ with $|A^{(s)}| = g$ solves locally the following
assignment problem:

$$\begin{aligned} \text{maximize} \quad & \sum_{i \in A^{(s)}} \sum_{j \in \mathcal{D}(i)} B_{ij}^{(s)} m_{ij} \\ \text{subject to} \quad & \sum_{j \in \mathcal{D}(i)} m_{ij} = 1, \quad i \in A^{(s)}, \\ & \sum_{\{i : j \in \mathcal{D}(i)\}} m_{ij} = 1, \quad j \in V, \\ & m_{ij} \in \{0, 1\}, \end{aligned} \qquad (5)$$

where $m_{ij}$ are binary optimization variables indicating whether the
participatory sensor $i$ is assigned to the virtual sensor $j$. The
problem formulated in (5) is equivalent to the maximum-weight perfect
matching problem in a bipartite graph, and hence, we propose to use the
classical Hungarian method to solve it (the worst-case time complexity
is $O(g^3)$ [13 [H).
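For very small $g$, problem (5) can be solved by exhaustive search, which is useful for checking a Hungarian-method implementation. The sketch below is a brute-force stand-in, not the Hungarian method itself; it enumerates all $g!$ assignments, and the function name and the benefit values are hypothetical.

```python
from itertools import permutations

def best_assignment(benefit, domains, sensors, virtual):
    """Exhaustively search the one-to-one assignments allowed by the domains."""
    best_value, best_map = None, None
    for perm in permutations(virtual):
        pairs = list(zip(sensors, perm))
        # Skip assignments that map a sensor outside its virtual domain.
        if any(j not in domains[i] for i, j in pairs):
            continue
        value = sum(benefit[i][j] for i, j in pairs)
        if best_value is None or value > best_value:
            best_value, best_map = value, dict(pairs)
    return best_value, best_map

# Hypothetical benefit entries for three candidate sensors.
benefit = {
    "a": {"v1": 0.9, "v2": 0.4},
    "b": {"v1": 0.8, "v2": 0.7, "v3": 0.2},
    "c": {"v2": 0.6, "v3": 0.5},
}
domains = {i: set(row) for i, row in benefit.items()}
value, assignment = best_assignment(benefit, domains, ["a", "b", "c"], ["v1", "v2", "v3"])
print(value, assignment)
```

If no perfect matching exists, the function returns `(None, None)`, which corresponds to a local solution that the cloud agent would discard.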

We can also tolerate an error $\epsilon > 0$ of the resulting total
benefit and relax the restriction of finding a perfect matching for
large $g$. This relaxation is reasonable when there are enough sensors
involved in solving these local optimization problems, as in this case
we can pick the best solution and discard those without a perfect
matching. In such a scenario, we can also use a linear-time
$(1 - \epsilon)$-approximation algorithm to solve (5). In this paper, we
use the Hungarian method to solve our formulated optimization problems.
Details of the algorithm are omitted due to space limitations; readers
are referred to [T3II1I1] for detailed information.

Each sensor solves locally the optimization problem given in (5) and sends its obtained solution to the cloud agent. This is done asynchronously. The cloud agent then selects the solution that leads to the maximum total benefit, and keeps all other solutions for later use in the event that the network dynamics invalidate the selected solution before the virtual sensing task completes.
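
As an illustration of the local assignment step, the following is a minimal sketch of a textbook $O(g^3)$ Hungarian method applied to a square benefit matrix. The function name `solve_assignment`, the use of `None` to mark a virtual sensor outside a participatory sensor's virtual domain, and the example values are assumptions made for this sketch; they are not taken from the paper.

```python
def solve_assignment(benefit):
    """Maximum-weight perfect matching on a square benefit matrix.

    benefit[i][j] is the (hypothetical) benefit of assigning participatory
    sensor i to virtual sensor j, or None when j is outside i's virtual
    domain. Returns (total_benefit, assignment) with assignment[i] = j,
    or None when no perfect matching exists.
    """
    g = len(benefit)
    allowed = [b for row in benefit for b in row if b is not None]
    big = 1 + 2 * sum(abs(b) for b in allowed)  # cost of a forbidden pair
    # The Hungarian method minimizes, so benefits are negated to maximize.
    cost = [[big if b is None else -b for b in row] for row in benefit]

    inf = float("inf")
    u = [0] * (g + 1)      # row potentials (1-indexed)
    v = [0] * (g + 1)      # column potentials (1-indexed)
    match = [0] * (g + 1)  # match[j] = row matched to column j (0 = free)
    way = [0] * (g + 1)    # back-pointers along the augmenting path
    for i in range(1, g + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (g + 1)
        used = [False] * (g + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, g + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(g + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1

    assignment = [None] * g
    for j in range(1, g + 1):
        assignment[match[j] - 1] = j - 1
    if any(benefit[i][j] is None for i, j in enumerate(assignment)):
        return None  # every perfect matching needs a forbidden pair
    return sum(benefit[i][j] for i, j in enumerate(assignment)), assignment


# Hypothetical benefit values for g = 3.
print(solve_assignment([[4, 1, 3], [2, 0, 5], [3, 2, 2]]))
```

Under these assumptions, the cloud agent's selection step amounts to taking the maximum over the totals reported by the sensors and keeping the remaining solutions as fallbacks.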

Complexity and message overhead. The time required to spread T across the network is $O(r^{-1} \log n)$ [13]. It takes $O(g)$ worst-case time to evaluate $\mathcal{D}(i)$ locally at sensor i. Also, the time required to spread information in the pruning and benefit construction phases is $O(r^{-1} n \log n)$. The pruning of the virtual domain $\mathcal{D}(i)$ requires node i to examine g received virtual domains, each having at most g entries. The worst-case local running time of pruning is then $O(g^2)$. Finally, the local running time of the Hungarian method is $O(g^3)$. Hence, the overall complexity of RADV is $O(\max\{r^{-1} n \log n,\, g^3\})$.

The average number of messages communicated per sensor during the sensor search phase is $O(1)$, and each message is $O(g)$ in size. During pruning of virtual domains, since every sensor exchanges a maximum of n domains, each of size that is also $O(g)$, the average number of messages communicated per sensor is $O(n)$. However, because messages are communicated up to h hops only, and only among the group of sensors that support the requirements of T, the average number of messages per sensor is typically small. Figure 1 shows the total time and the average number of messages per sensor required during both the domain pruning and the benefit construction phases. The total time growth is linearithmic in n when T is sent to exactly one sensor and when Q is connected. This time can, in practice, be decreased significantly if T is initially sent to multiple sensors. Additionally, the average number of messages per sensor is shown to scale linearly with n, and is typically a very small fraction of n.

Figure 1: Time (in number of iterations) and message overhead (in number of communicated messages) resulting from constructing the benefit matrices.


4 Distributed Estimation of $\theta$

After completing the sensor virtualization task using the proposed RADV algorithm described in Section 3, the virtual sensors run an in-network parameter estimation algorithm to compute $\theta$ in a distributed manner. In this section, we present our proposed Randomized and Asynchronous Distributed Estimation (RADE) algorithm. We first follow the standard ADMM approach to derive the primal, dual, and Lagrangian variable update equations, and then describe the proposed RADE algorithm. For clarity of notation, in what follows we refer to the set of g selected participatory sensors, determined by means of the proposed RADV algorithm, simply as A.

The centralized estimation approach given in (3) is first decomposed into g local estimates of $\theta$ (one $\theta_i$ for each $i \in A$) while constraining the local estimates with the coupling constraints $\theta_i = \theta_j$ for all $(i,j) \in P$. This results in the following optimization problem:

\[
\begin{aligned}
\text{minimize} \quad & \sum_{i \in A} \| x_i - H_i \theta_i \|^2 \\
\text{subject to} \quad & \theta_i - \theta_j = 0 \quad \text{for all } (i,j) \in P,
\end{aligned}
\tag{6}
\]

where $\{\theta_i, i \in A\}$ are the optimization variables.

By introducing an auxiliary variable, z, we decouple the constraints in (6), so that $\theta_i - z = 0$ for all $i \in A$. However, this requires that z be shared among all g sensors. Instead, we introduce g auxiliary variables, $z_i$, and equivalently write the optimization problem as

\[
\begin{aligned}
\text{minimize} \quad & \frac{1}{2} \sum_{i \in A} \| x_i - H_i \theta_i \|^2 \\
\text{subject to} \quad & \theta_i - z_j = 0 \quad \text{for all } (i,j) \in P.
\end{aligned}
\tag{7}
\]

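Let $\lambda = \{\lambda_{i,j} \in \mathbb{R}^N : (i,j) \in P\}$ and $\rho = \{\rho_{i,j} \in \mathbb{R} : (i,j) \in P\}$ denote, respectively, the set of Lagrangian multipliers and the set of penalty parameters. The augmented Lagrangian is

\[
\mathcal{L}_\rho(\theta, z, \lambda) = \sum_{i \in A} \Bigg[ \frac{1}{2} \| x_i - H_i \theta_i \|^2 - \sum_{j \in A : (i,j) \in P} \lambda_{i,j}^T (\theta_i - z_j) + \sum_{j \in A : (i,j) \in P} \frac{\rho_{i,j}}{2} \| \theta_i - z_j \|^2 \Bigg].
\tag{8}
\]

By setting the gradient of Eq. (8) with respect to $\theta_i$ to zero and solving for $\theta_i$, we get

\[
\theta_i = \Bigg( H_i^T H_i + \sum_{j \in A : (i,j) \in P} \rho_{i,j} I \Bigg)^{-1} \Bigg( H_i^T x_i + \sum_{j \in A : (i,j) \in P} \big( \lambda_{i,j} + \rho_{i,j} z_j \big) \Bigg).
\]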


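Similarly, we solve for $z_j$ by setting the gradient with respect to $z_j$ to zero and rearranging the indices of the Lagrangian multipliers and the penalty parameters. It follows that

\[
z_i = \Bigg( \sum_{j \in A : (i,j) \in P} \rho_{j,i} \Bigg)^{-1} \sum_{j \in A : (i,j) \in P} \big( \rho_{j,i} \theta_j - \lambda_{j,i} \big).
\]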
The preceding analysis leads to the conventional ADMM-based distributed consensus estimation algorithm (9), where the superscript k denotes the value of the variable at the k-th iteration. This conventional ADMM algorithm, given in (9), requires synchronization and variable updates among the sensors [12]. Moreover, at each iteration k, each sensor i must send $z_i^{(k)}$ and $\theta_i^{(k)}$ to all other sensors it is connected to, so that they can evaluate their (k+1)-th primal and dual variables and Lagrangian multipliers. When M is small, this algorithm incurs communication overhead that can be shown to be worse than the communication overhead incurred by centralized estimation methods. However, when M is large, the conventional ADMM algorithm incurs less communication overhead than centralized estimation methods, but it still remains practically unattractive due to other weaknesses, detailed later in Section 5.
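
The display for iteration (9) is not legible in this copy. As a sketch only, assuming the standard ADMM ordering and the sign convention of the augmented Lagrangian (8), the iteration can be assembled from the two closed-form updates derived above. The multiplier step shown here is the usual dual ascent step and is an assumption, not a transcription of the original equation:

```latex
\begin{aligned}
\theta_i^{(k+1)} &= \Bigg( H_i^T H_i + \sum_{j \in A : (i,j) \in P} \rho_{i,j} I \Bigg)^{-1}
  \Bigg( H_i^T x_i + \sum_{j \in A : (i,j) \in P} \big( \lambda_{i,j}^{(k)} + \rho_{i,j} z_j^{(k)} \big) \Bigg), \\
z_i^{(k+1)} &= \Bigg( \sum_{j \in A : (i,j) \in P} \rho_{j,i} \Bigg)^{-1}
  \sum_{j \in A : (i,j) \in P} \big( \rho_{j,i}\, \theta_j^{(k+1)} - \lambda_{j,i}^{(k)} \big), \\
\lambda_{i,j}^{(k+1)} &= \lambda_{i,j}^{(k)} - \rho_{i,j} \big( \theta_i^{(k+1)} - z_j^{(k+1)} \big).
\end{aligned}
```

Each line uses values that other sensors produced in the same or the previous iteration, which is the source of the synchronization requirement noted above.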

Given the absolute and relative tolerances, $\epsilon_{\mathrm{abs}}$ and $\epsilon_{\mathrm{rel}}$, specified by the SLAs between the cloud client and the cloud agent, we define the primal and dual tolerances, controlling the convergence of the algorithm at iteration k, as

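\[
\epsilon_i^{\mathrm{pri}}(k) = \sqrt{g}\, \epsilon_{\mathrm{abs}} + \epsilon_{\mathrm{rel}} \max\big( \| \theta_i^{(k)} \|,\, \| {-z_i^{(k)}} \| \big)
\]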




and 

^ ^ WPjAjAV 

jGA 

The tolerances $\epsilon_i^{\mathrm{pri}}$ and $\epsilon_i^{\mathrm{dual}}$ define the stopping criteria of sensor i; i.e., sensor i stops updating $\theta_i$ and $z_i$ when



( 10 ) 


( 11 ) 


The stopping criteria of RADE differ from those of the conventional ADMM. Unlike the conventional ADMM, where all sensors stop their computations at the same time using common stopping criteria and common primal and dual tolerances, the stopping criteria (10) and (11) of RADE allow a sensor i to stop its computations asynchronously and independently of other sensors. However, these criteria alone are not enough to ensure an asynchronous implementation, as synchronization is still required for the dual and primal variable updates at iteration k + 1 due to their dependencies on iteration k.
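
The dependency of the k + 1 updates on iteration-k values can be seen in a small synchronous sketch. The code below is not the per-link formulation (7); it assumes the simpler textbook global-consensus ADMM in scaled form (a single shared z, a scalar $\theta$, and one common penalty `rho`), with the common residual-based stopping rule from the ADMM literature standing in for (10) and (11). All names and default values are hypothetical.

```python
import math


def consensus_admm(H, x, rho=1.0, eps_abs=1e-8, eps_rel=1e-8, max_iter=10000):
    """Synchronous global-consensus ADMM for a scalar parameter theta.

    H[i] and x[i] hold sensor i's regressors and observations, so that
    x[i][m] is approximately H[i][m] * theta. Returns (z, iterations).
    """
    g = len(H)
    a = [sum(h * h for h in Hi) for Hi in H]                 # H_i^T H_i
    b = [sum(h * v for h, v in zip(Hi, Xi))                  # H_i^T x_i
         for Hi, Xi in zip(H, x)]
    theta = [0.0] * g
    u = [0.0] * g   # scaled multipliers, one per sensor
    z = 0.0
    for k in range(1, max_iter + 1):
        # Every sensor needs the iteration-k value of z before it can update.
        theta = [(b[i] + rho * (z - u[i])) / (a[i] + rho) for i in range(g)]
        z_old = z
        # The consensus step needs the new theta of all sensors.
        z = sum(theta[i] + u[i] for i in range(g)) / g
        u = [u[i] + theta[i] - z for i in range(g)]

        # Common (network-wide) primal and dual residuals and tolerances.
        r = math.sqrt(sum((t - z) ** 2 for t in theta))
        s = rho * math.sqrt(g) * abs(z - z_old)
        theta_norm = math.sqrt(sum(t * t for t in theta))
        u_norm = math.sqrt(sum(ui * ui for ui in u))
        eps_pri = math.sqrt(g) * eps_abs + eps_rel * max(theta_norm,
                                                         math.sqrt(g) * abs(z))
        eps_dual = math.sqrt(g) * eps_abs + eps_rel * rho * u_norm
        if r <= eps_pri and s <= eps_dual:
            return z, k   # all sensors stop together
    return z, max_iter
```

In this form all sensors stop at the same iteration and no update can start before the previous one has finished everywhere, which is the behavior that RADE is designed to avoid.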

To ensure a fully asynchronous implementation, we use the doubly stochastic transition matrix $T \in \mathbb{R}^{g \times g}$, where $T_{ij}$ is the probability that a sensor i contacts another sensor j at any iteration, for deciding the communications among sensors. We can have

\[
T_{ij} =
\begin{cases}
\dfrac{1}{d_i + 1} & \text{if } i = j \text{ or } (i,j) \in P, \\
0 & \text{otherwise,}
\end{cases}
\]

where $d_i = |\{ j \in A : (i,j) \in P \}|$ is the degree of the virtual sensor, in the task request $\mathcal{T}$, that i virtualizes. At iteration k + 1, sensor i may need to contact only one sensor j, unless both of i's stopping criteria, (10) and (11), are already satisfied. In contrast, sensor j can be contacted by more than one sensor if j is not contacting any other sensor, even when both of j's stopping criteria are satisfied.
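
The contact rule can be sketched as follows, assuming virtual links are given as undirected index pairs. The helper names `transition_matrix` and `pick_contact` are hypothetical and introduced only for this illustration.

```python
import random


def transition_matrix(g, links):
    """T[i][j] = 1/(d_i + 1) if i == j or (i, j) is a virtual link, else 0."""
    nbrs = [set() for _ in range(g)]
    for i, j in links:          # links are treated as undirected
        nbrs[i].add(j)
        nbrs[j].add(i)
    T = [[0.0] * g for _ in range(g)]
    for i in range(g):
        p = 1.0 / (len(nbrs[i]) + 1)
        T[i][i] = p
        for j in nbrs[i]:
            T[i][j] = p
    return T


def pick_contact(T, i, rng=random):
    """Sample the sensor that i contacts; returning i means i stays idle."""
    return rng.choices(range(len(T)), weights=T[i], k=1)[0]


# Cycle topology with g = 4 virtual sensors.
T = transition_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(pick_contact(T, 0))
```

With this construction every row of T sums to one. The columns also sum to one when all virtual sensors have the same degree, as in the complete and cycle topologies.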

Figure 2: MSE of RADE compared to that achieved under ADMM and LS at different noise power levels and for different virtual sensor network topologies.

Upon contacting j, sensor i pushes $\theta_i^{(k)}$ to j only if i's primal stopping condition is not satisfied and pushes $z_i^{(k)}$ to j only if i's dual stopping condition is not satisfied. Also, i pulls $\theta_j^{(k)}$ from j only if j's primal stopping condition is not satisfied and pulls $z_j^{(k)}$ only if j's dual stopping condition is not satisfied. Finally, both i and j update their k + 1 variables using the most recent values they received from other sensors.
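
The push and pull rules above can be written down as a short sketch. The dictionary layout of a sensor's state, with the keys `theta`, `z`, `primal_done`, and `dual_done`, is an assumption made for illustration only.

```python
def exchange(i_state, j_state):
    """Return what sensor i pushes to and pulls from its contacted peer j.

    Each state is a dict with hypothetical keys: 'theta', 'z', and the flags
    'primal_done' / 'dual_done' recording whether that sensor's primal and
    dual stopping conditions already hold.
    """
    pushed, pulled = {}, {}
    if not i_state["primal_done"]:
        pushed["theta"] = i_state["theta"]   # i still updates theta_i
    if not i_state["dual_done"]:
        pushed["z"] = i_state["z"]           # i still updates z_i
    if not j_state["primal_done"]:
        pulled["theta"] = j_state["theta"]   # j still updates theta_j
    if not j_state["dual_done"]:
        pulled["z"] = j_state["z"]           # j still updates z_j
    return pushed, pulled


# Sensor i has met only its dual condition; sensor j has met both.
i_state = {"theta": 1.0, "z": 0.9, "primal_done": False, "dual_done": True}
j_state = {"theta": 1.1, "z": 1.0, "primal_done": True, "dual_done": True}
pushed, pulled = exchange(i_state, j_state)
print(len(pushed) + len(pulled))  # number of values sent in this contact
```

As more stopping conditions are met, fewer values cross the link per contact, which is how this rule limits communication overhead.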

Mean square error and convergence. The asynchronous and randomized design features of RADE do not impact the Mean Square Error (MSE) achieved by RADE when compared to ADMM. This is explained as follows. In both ADMM and RADE, the number of necessary dual and primal variable updates required until convergence remains unchanged, so that convergence to the same estimate is guaranteed in both algorithms. Figure 2 shows the MSE achievable under both RADE and ADMM when compared to LS under each of the three studied sensor network topologies: complete, star, and cycle. These results support the optimality of RADE that we discussed intuitively: all approaches have the same accuracy. Each of them, however, achieves this accuracy at a different performance cost, as will be discussed later.

On the other hand, RADE exhibits a linear convergence rate ($O(1/k)$), similar to the conventional ADMM. Figure 3 shows the number of time steps required for both RADE and ADMM to converge under different relative tolerance parameters, $\epsilon_{\mathrm{rel}}$. RADE convergence tends to be more restricted by the randomized nature of the algorithm for smaller values of $\epsilon_{\mathrm{rel}}$, which can be seen by the increasing number of steps as g increases if $\epsilon_{\mathrm{rel}}$ = 10“^. ADMM generally requires fewer steps to converge by relaxing the consensus constraint (through reducing $\epsilon_{\mathrm{rel}}$). However, as will be seen in the numerical results section later, this increase in the number of convergence steps is acceptable when considering the amount of communication overhead that the algorithm saves.

5 Numerical Results 

In this section, we evaluate the performance of the proposed RADV and RADE algorithms through simulations. In our simulations, the swarm of sensors, Q, and the virtual sensing task requests, T, are generated using the parameters summarized in Table 1. The virtual sensor network topology can be complete, cyclic, or star, with a randomly chosen central location, c. We consider receiving and servicing only one virtual sensing task request at a time. The absolute and relative tolerances, $\epsilon_{\mathrm{abs}}$ and $\epsilon_{\mathrm{rel}}$, are set to 10“^ unless specified otherwise.

Figure 3: Number of time steps (k) needed until convergence of RADE when compared to ADMM for complete topology under different relative tolerance values $\epsilon_{\mathrm{rel}}$.

Figure 4: Rejection rate encountered at different n.

Figure 5: Virtualization cost of RADV when compared to the upper bound under different topologies.

Table 1: Simulation Parameters

Parameter    r      c(i)            m               d      h
Value        0.1    ~ U(50, 100)    ~ U(25, 50)     0.2    20

Figure 4 shows the rejection rate encountered with different T topologies and n values. As we consider only a single request at a time, the results shown in this figure reflect mainly the impact of the virtual sensor network topology, the number of sensors n, and the simulation parameters given in Table 1 on the rejection rate. The denser the swarm of sensors is, the lower the rejection rate is, implying that the cloud is capable of granting a higher number of requests. As a marginal note, a star topology is slightly easier to virtualize than a complete or a cyclic topology.

One way of assessing the effectiveness of the virtualization algorithm is by measuring the difference between the total virtualization benefit given in (5) and the cost associated with the sensor virtualization introduced in Section 2.2.2. For a given number of virtual sensors, the cost is mainly determined by the choice of the topology (the star topology has the lowest cost and the complete topology has the highest one). For a given topology, the total benefit is maximized when each virtual sensor is assigned to the participatory sensor with the maximum capacity and each virtual link is mapped to exactly one physical link. We refer to this maximized benefit as the upper bound.


Figure 6: Number of time steps until convergence of RADE when compared 
to ADMM under different topologies. 

In Figure 5, we evaluate the virtualization effectiveness achieved by RADV under different virtual topologies. As the swarm gets denser, RADV achieves a Total Benefit - Cost value that is very close to the upper bound. Since the lowest


ing task in a star or a cyclic topology. On the other hand, 
convergence and communication overhead of the distributed 
estimation is also impacted by the cloud agent’s choice of 
the virtual topology. This creates a design trade-off, as we 
will see in the next two paragraphs. 

Figure 6 shows the impact of the virtual topology choice on the convergence performance of RADE when compared to ADMM. If g is small (three to eight), the impact of the virtual topology on the convergence of RADE and ADMM is minimal. This is because the degree of parallelism (the number of sensors active at the same time) is more restricted by the small number of virtual sensors g. In such a scenario, it is convenient for the cloud agent to always arrange the virtual sensors in a star topology. However, as g increases, the impact of the virtual topology becomes significant, as the degree of parallelism is higher in a complete topology, enabling RADE to converge much faster as g gets larger. This convergence becomes slower with star and cyclic topologies. This is because in star and cyclic topologies, only a few sensors are active at a time, making RADE and ADMM converge in a number of steps comparable to that of ADMM's sequential implementation. In this latter scenario, the cloud agent should arrange the virtual sensors as a complete topology unless the SLA permits slower convergence.

Figure 7: Communication overhead when comparing RADE to ADMM and LS under different g values.

Moreover, RADE converges in a higher number of steps when compared to the conventional ADMM. This is because in ADMM, all sensors are active at each time step, and a sensor exchanges its updated variables with all of its neighbors, whereas in RADE, only disjoint sensor pairs are active at a time and variables are updated only between pairs of sensors. Nevertheless, we argue that this loss in speed of convergence for RADE is marginal when compared to the significant savings in communication overhead.

Figure 7 shows the total number of $O(N)$-sized messages exchanged during estimation when comparing RADE, ADMM, and LS for M = 100. The number of messages exchanged by RADE is at least an order of magnitude less than the number of messages generated under ADMM. Also, the communication overhead of RADE is less than that of the centralized LS, especially as M becomes large. These savings in communication overhead are attributed to the asynchronous design of RADE, in which sensors exchange messages only while a primal or dual variable has not yet met its specified tolerance.

6 Conclusion and Discussion 

We propose cloud-based remote sensing algorithms for enabling distributed estimation of unknown parameters via sensor network virtualization. The algorithm has the following phases: sensor search, domain pruning, benefit matrix construction, virtual-participatory sensor assignment solver, and distributed estimation. Using simulation, we show that the proposed algorithms reduce communication overhead significantly without compromising the estimation error when compared to the traditional ADMM algorithm.

We also show that the convergence time of our proposed algorithms maintains linear convergence behavior, as in the case of conventional ADMM.